Prediction of PKCθ Inhibitory Activity Using the Random Forest Algorithm
Authors
Abstract
This work predicts the inhibitory activity of a series of 208 structurally diverse PKCθ inhibitors using the Random Forest (RF) algorithm with Mold2 molecular descriptors. The RF model proved to be a robust predictor of the experimental pIC50 values, giving an external R²pred of 0.72 and a standard error of prediction (SEP) of 0.45 for an external prediction set of 51 inhibitors that were not used in developing the QSAR models. Using RF's built-in measure of relative descriptor importance, one descriptor, the number of group donor atoms for H-bonds (with N and O), was identified as playing a crucial role in PKCθ inhibitory activity. We hope that the developed RF model will be helpful in screening novel compounds and predicting their PKCθ inhibitory activity.
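The two external-validation statistics quoted in the abstract can be illustrated with a minimal sketch. It assumes the common QSAR definitions: R²pred as one minus the prediction error sum of squares divided by the sum of squares about the training-set mean, and SEP as the root-mean-square prediction error. The abstract does not state which formulas the authors used, so both definitions, the function names, and the numbers below are assumptions for illustration only.

```python
import math


def r2_pred(y_obs, y_hat, y_train_mean):
    """External R^2: 1 - PRESS / sum of squares about the training-set mean.

    Assumed definition; the source abstract does not give the formula.
    """
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_hat))
    ss_total = sum((o - y_train_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss_total


def sep(y_obs, y_hat):
    """Standard error of prediction, taken here as the root-mean-square error.

    Some authors divide by n - 1 instead of n; this is an assumption.
    """
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_hat))
    return math.sqrt(press / len(y_obs))


# Made-up pIC50 values for four hypothetical test compounds,
# not data from the study.
y_obs = [6.0, 7.0, 8.0, 5.0]
y_hat = [6.5, 6.5, 7.5, 5.5]
y_train_mean = 6.5  # hypothetical mean pIC50 of the training set

print(r2_pred(y_obs, y_hat, y_train_mean))
print(sep(y_obs, y_hat))
```

Using the training-set mean rather than the test-set mean in the denominator means the statistic measures improvement over a baseline that is available before any test compound is seen.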
Related articles
Application of ensemble learning techniques to model the atmospheric concentration of SO2
For pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
Evaluation of genomic prediction accuracy in different genomic architectures of quantitative and threshold traits with imputation of simulated genomic data using the random forest method
Genomic selection is a promising approach for discovering genetic variants influencing quantitative and threshold traits and for improving the genetic gain and accuracy of genomic prediction in animal breeding. Since a proportion of genotypes are generally uncalled, assessing the accuracy of genomic prediction requires imputation of missing genotypes. The objectives of this study were (1) to quantify...
Implementation of Random Forest Algorithm in Order to Use Big Data to Improve Real-Time Traffic Monitoring and Safety
Active traffic management can now perform better thanks to the real-time big data available in transportation systems. With the advancement of big data, actively and appropriately monitoring and improving traffic safety has become a necessity. Performance efficiency and traffic safety are considered important elements in measuring the pe...
Diagnosis of Diabetes Using a Random Forest Algorithm
Background: Diabetes is the fourth leading cause of death in the world. Because so many people around the world have the disease or are at risk for it, diabetes can be called the disease of the century. Diabetes has devastating effects on the health of people in the community and, if diagnosed late, can cause irreparable damage to vision, kidneys, heart, arteries and so on. Therefore, it...
Prediction of maximum surface settlement caused by earth pressure balance shield tunneling using random forest
Due to urbanization and population growth, the need for metro tunnels has increased considerably in urban areas. Estimating the surface settlement caused by tunnel excavation is an important task, especially where tunnels are excavated in urban areas or beneath important structures. Many models have been established for this purpose by extracting the relationship between the settlement a...
Journal title:
Volume 11, Issue
Pages -
Publication date: 2010